Querying NoSQL-based Crowdsourcing Systems Efficiently

Authors

  • Alfredo Cuzzocrea
  • Marcello Di Stefano
  • Paolo Fosci
  • Giuseppe Psaila
Abstract

In this paper, we provide a novel approach for effectively and efficiently supporting query processing tasks in novel NoSQL crowdsourcing systems. The idea of our method is to exploit the social knowledge available from reviews about products of any kind, freely provided by customers through specialized web sites. We thus define a NoSQL database system for large collections of product reviews, where queries can be expressed as natural language sentences and answers are modeled as lists of products ranked by the relevance of their reviews w.r.t. those sentences. The best ranked products in the result list can be seen as the best hints for the user based on crowd opinions (the reviews). By exploiting the well-known IMDb dataset, which comprises more than 2 million reviews for more than 100,000 movies, we experimentally show that our prototype obtains good performance in terms of execution time, demonstrating that our approach is feasible.
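
As a rough illustration of the idea described in the abstract, the following Python sketch ranks products by how well their reviews match a natural language query. The abstract does not state the relevance measure or the storage layer the authors use, so everything here is an assumption made for this example only: the Jaccard term overlap as relevance, keeping the best review score per product, the in-memory list of reviews, and the hypothetical names `tokenize` and `rank_products`.

```python
import re


def tokenize(text):
    """Lowercase the text and return its set of alphanumeric terms."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def rank_products(query, reviews, top_k=3):
    """Rank products by the relevance of their reviews to the query.

    reviews is an iterable of (product, review_text) pairs.
    Relevance is assumed to be the Jaccard similarity between the
    query terms and the review terms; a product's score is assumed
    to be the score of its best-matching review.
    """
    query_terms = tokenize(query)
    scores = {}
    for product, text in reviews:
        review_terms = tokenize(text)
        union = query_terms | review_terms
        if not union:
            continue  # nothing to compare
        score = len(query_terms & review_terms) / len(union)
        # Keep only products with at least one matching review.
        if score > 0 and score > scores.get(product, 0.0):
            scores[product] = score
    # Highest score first; ties are broken by product name.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:top_k]


sample_reviews = [
    ("Movie A", "A gripping space adventure with great visuals"),
    ("Movie B", "Slow romantic drama"),
    ("Movie A", "Boring"),
    ("Movie C", "Space adventure for kids"),
]

for product, score in rank_products("space adventure", sample_reviews):
    print(product, round(score, 3))
```

In this toy setup a short review that shares most of its terms with the query outranks a longer one, which is a property of Jaccard similarity and not necessarily of the system described in the paper.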

Similar resources

An Approach of SQL to JSON Transformation For Handling Database Operations

Nowadays NOSQL databases are becoming more popular. Companies like Google, Facebook, and Amazon have created their own NOSQL databases based on their requirements. Different types of querying approaches are followed by different NOSQL databases, whereas traditional databases like MySQL, ORACLE, etc. follow SQL for querying. Most of the companies are shifting from traditional databases to NOSQL ...

Distributed RDF Triple Store Using HBase and Hive

The growth of web data has presented new challenges regarding the ability to effectively query RDF data. Traditional relational database systems efficiently scale and query distributed data. With the development of Hadoop and its implementation of the MapReduce framework, along with HBase, a NoSQL data store, the semantics of processing and querying data has changed. Given the existing structure of ...

Towards Schema-independent Querying on Document Data Stores

Document is a pervasive semi-structured data model in today’s Web and the Internet of Things (IoT) applications where the data structure is rapidly evolving over time. NoSQL document-oriented databases are well-tailored to efficiently load and manage massive collections of heterogeneous documents without any prior structural validations. However, this flexibility becomes a serious challenge whil...

A Domain-Independent Model for Capturing Annotation Data and Provenance Modeling and Representation of CrowdTruth

With the advent of crowdsourcing as a viable method for data collection, the need for integrated annotation frameworks has increased. The relatively low cost and scalability of crowdsourcing methods allow for various types of annotation tasks to be run in different domains across different data modalities. From these tasks, a large amount of unprecedented data follows which needs to be processe...

Dial M for Management: Next Generation NoSQL

NoSQL databases offer a powerful and flexible means of querying non-relational data. However, leading NoSQL systems typically achieve high performance goals while minimizing support for traditional data management services and defying the establishment of solid formal models. In particular, the systems generally shun tools and design principles rendered conventional by the long history of RDBMS...

Publication date: 2016